Genotypic versus Behavioural Diversity for Teams of Programs under the 4-v-3 Keepaway Soccer Task

Authors

  • Stephen Kelly
  • Malcolm I. Heywood
Abstract

Keepaway soccer is a challenging robot control task that has been widely used as a benchmark for evaluating multi-agent learning systems. Most research in this domain has approached it from the perspective of reinforcement learning (function approximation) and neuroevolution. One of the challenges in multi-agent tasks such as keepaway is to formulate effective mechanisms for diversity maintenance. Indeed, the best results to date on this task utilize some form of neuroevolution with genotypic diversity. In this work, a symbiotic framework for evolving teams of programs is utilized, with both genotypic and behavioural forms of diversity maintenance considered. Specific contributions of this work include a simple scheme for characterizing genotypic diversity under teams of programs and its comparison to behavioural formulations for diversity under the keepaway soccer task. Unlike previous research concerning diversity maintenance in genetic programming (GP), we are explicitly interested in solutions taking the form of teams of programs.

Copyright (c) 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Introduction

Symbiotic Bid-Based GP (SBB) is a hierarchical framework for symbiotically coevolving teams of simple programs over two distinct cycles of evolution (Kelly, Lichodzijewski, and Heywood 2012). The first cycle produces a library of diverse, specialist teams with limited capability. The second cycle builds more general and robust policies by re-using the library, essentially building generalist teams from multiple specialists. Thus, diversity maintenance is critical during the first stage of evolution to ensure the identification of a wide range of specialist behaviours.

Keepaway soccer is a challenging benchmark task for multi-agent learning in which a team of K keepers must maintain possession of the ball while an opposing team of K - 1 takers attempts to gain possession (Stone et al. 2006). The keepers must learn a policy that maximizes the length of play against the takers, which follow a pre-specified behaviour. The players' sensors and actuators are noisy, making the task partially observable and highly stochastic. The size of the playing region and the number of keepers vs. takers may be adjusted to scale the difficulty of the task. In this work we use the 4-v-3 task configuration on a 25 m x 25 m field.

The keepaway task has been dominated by value function-based approaches to reinforcement learning (Kalyanakrishnan and Stone 2009) and neuroevolution (Metzen et al. 2007). In the case of neuroevolution, genotypic diversity maintenance was shown to be a critical factor in the algorithm's success. However, the maintenance of behavioural diversity is increasingly found to be more effective than diversity mechanisms operating solely in genotype space, in particular when the domain is deceptive, for example, due to a noisy fitness function (Gomez 2009). In this work we introduce novel methods for measuring genotypic and behavioural diversity among teams of programs, and empirically compare their efficacy under the keepaway task.

Diversity Mechanisms

In the keepaway task, the fitness of a team is the mean episode length over all games played. In order to promote population diversity, each team's novelty must also factor into the selection process, where novelty refers to the mean genotypic or behavioural distance between a team and all other members of the same population. Several methods to balance fitness and novelty are possible, including fitness sharing, crowding, or multi-objective optimization. In this work we adopt a simple linear combination of fitness and novelty (Cuccu and Gomez 2011), thus each team's score is defined prior to selection: score(tm
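The score definition is cut off in this excerpt, so the following Python sketch is only an illustration of the general idea described in the text: novelty as the mean distance from a team to every other team in the population, combined linearly with fitness. The weight `rho`, the min-max normalization, the function names, and the Jaccard distance used in the usage note are all assumptions made for this example; they are not taken from the paper and should not be read as the authors' formulation.

```python
from statistics import mean
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def mean_novelty(population: Sequence[T],
                 distance: Callable[[T, T], float]) -> List[float]:
    """Novelty of each team: mean distance to all OTHER population members."""
    n = len(population)
    if n < 2:
        return [0.0] * n
    return [
        mean(distance(population[i], population[j]) for j in range(n) if j != i)
        for i in range(n)
    ]


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant list maps to all zeros (assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_scores(fitness: Sequence[float],
                    novelty: Sequence[float],
                    rho: float) -> List[float]:
    """Linear blend of normalized fitness and novelty.

    rho is a hypothetical weight in [0, 1]: 0 selects on fitness only,
    1 selects on novelty only.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    fit_n = min_max_normalize(fitness)
    nov_n = min_max_normalize(novelty)
    return [(1.0 - rho) * f + rho * v for f, v in zip(fit_n, nov_n)]


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """Illustrative genotypic distance between teams viewed as sets of program ids."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)
```

As a usage example, three hypothetical teams represented by sets of program identifiers, `[frozenset({1, 2, 3}), frozenset({1, 2, 3}), frozenset({7, 8, 9})]`, with mean episode lengths `[10.0, 12.0, 8.0]`, can be scored by passing the output of `mean_novelty(teams, jaccard_distance)` to `combined_scores(fitness, novelty, rho=0.5)`. With this weighting, the third team, which has the lowest fitness but shares no programs with the others, ends up tied with the fittest team.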


Similar resources

On Diversity, Teaming, and Hierarchical Policies: Observations from the Keepaway Soccer Task

The 3-versus-2 Keepaway soccer task represents a widely used benchmark appropriate for evaluating approaches to reinforcement learning, multi-agent systems, and evolutionary robotics. To date most research on this task has been described in terms of developments to reinforcement learning with function approximation or frameworks for neuro-evolution. This work performs an initial study using a r...


Keepaway Soccer: A Machine Learning Testbed

RoboCup simulated soccer presents many challenges to machine learning (ML) methods, including a large state space, hidden and uncertain state, multiple agents, and long and variable delays in the effects of actions. While there have been many successful ML applications to portions of the robotic soccer task, it appears to be still beyond the capabilities of modern machine learning techniques to...


Keepaway Soccer: From Machine Learning Testbed to Benchmark

Keepaway soccer has been previously put forth as a testbed for machine learning. Although multiple researchers have used it successfully for machine learning experiments, doing so has required a good deal of domain expertise. This paper introduces a set of programs, tools, and resources designed to make the domain easily usable for experimentation without any prior knowledge of RoboCup or the S...


Evolving Static Representations for Task Transfer

An important goal for machine learning is to transfer knowledge between tasks. For example, learning to play RoboCup Keepaway should contribute to learning the full game of RoboCup soccer. Previous approaches to transfer in Keepaway have focused on transforming the original representation to fit the new task. In contrast, this paper explores the idea that transfer is most effective if the repre...


Half Field Offense in RoboCup Soccer: A Multiagent Reinforcement Learning Case Study

We present half field offense, a novel subtask of RoboCup simulated soccer, and pose it as a problem for reinforcement learning. In this task, an offense team attempts to outplay a defense team in order to shoot goals. Half field offense extends keepaway [11], a simpler subtask of RoboCup soccer in which one team must try to keep possession of the ball within a small rectangular region, and awa...




Publication date: 2014